docs: add user-facing data catalog page (data.qmd)#146

Merged
rdhyee merged 1 commit into isamplesorg:main from rdhyee:docs/data-page on Apr 24, 2026

rdhyee (Contributor) commented Apr 24, 2026

Summary

A public-facing companion to SERIALIZATIONS.md (#143). Where that catalog is an internal reference, this page is the landing page for researchers and developers.

Contents

  • Quick-pick table: "if you want to do X → use file Y"
  • Five copy-pasteable DuckDB snippets: Kyoto bbox on lite; source breakdown on wide; top H3 res-4 cells; OpenContext artifacts on sample_facets_v2; edge-predicate counts on narrow
  • H3 tier breakpoints for map authors
  • Cross-links to SERIALIZATIONS, QUERY_SPEC, PQG spec, conformance matrix, Zenodo deposition issue
  • Data-source + licensing note pointing to the iSamples Zenodo community

Verification

Every snippet on the page was executed against live `data.isamples.org` URLs on 2026-04-24 and returned non-empty results. Two snippets from the original draft (the intro quick-start and §3.4) were corrected during verification:

  • Intro snippet was using `source` on the wide parquet; wide uses `n` (PQG convention). Fixed + flagged inline with a comment.
  • §3.4 was filtering OC samples for `object_type ILIKE '%pottery%'`; the object_type column holds vocabulary URIs, not English labels. Switched to `%artifact%` which matches a real URI fragment.
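
The §3.4 fix can be illustrated with a small self-contained sketch. It does not query the real files: it uses Python's built-in `sqlite3` as a stand-in for DuckDB (SQLite's `LIKE` is case-insensitive for ASCII, which roughly approximates `ILIKE`), and the table name, `pid` column, and URIs are invented for illustration rather than taken from the iSamples vocabularies.

```python
import sqlite3

# Hypothetical rows: these URIs are made up for illustration only.
rows = [
    ("s1", "https://example.org/vocab/objecttype/artifact"),
    ("s2", "https://example.org/vocab/objecttype/organismpart"),
    ("s3", "https://example.org/vocab/objecttype/artifact"),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE samples (pid TEXT, object_type TEXT)")
con.executemany("INSERT INTO samples VALUES (?, ?)", rows)


def count_matching(pattern: str) -> int:
    """Count rows whose object_type matches a LIKE pattern."""
    # SQLite's LIKE ignores ASCII case, approximating DuckDB's ILIKE.
    return con.execute(
        "SELECT count(*) FROM samples WHERE object_type LIKE ?", (pattern,)
    ).fetchone()[0]


# An English label never appears in the URI, so it matches nothing.
pottery_hits = count_matching("%pottery%")
# A fragment that is actually part of the URI does match.
artifact_hits = count_matching("%artifact%")
print(pottery_hits, artifact_hits)
```

With these made-up rows the label pattern matches nothing while the URI fragment does, which is the same failure mode the verification pass caught.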

How this differs from PR #143

| | PR #143 (SERIALIZATIONS.md) | This PR (data.qmd) |
|---|---|---|
| Audience | Internal catalog | Public researcher/dev |
| Contents | Every file, full schema headline, upstream/downstream | Top 5 files, quick-pick, one-liner queries |
| Rendering | Markdown (internal) | Quarto (rendered on isamples.org) |

Both live alongside each other. Cross-linked.

Related

  • isamplesorg.github.io#137 — Strand landscape
  • isamplesorg.github.io#139 — Zenodo deposition plan
  • isamplesorg.github.io#143 — SERIALIZATIONS.md catalog
  • isamplesorg.github.io#144 — Repo listing expansion
  • isamplesorg.github.io#145 — QUERY_SPEC v0.1 + v0.2

🤖 Generated with Claude Code

A public-facing companion to SERIALIZATIONS.md (PR isamplesorg#143). Where the
catalog is internal reference ("every file with role, size, upstream,
consumers"), this page is the researcher/developer landing:

- Quick-pick table mapping "if you want to do X → use file Y"
- Five copy-pasteable DuckDB snippets (every one executed clean
  against live R2 URLs during authoring)
- H3 tier breakpoint reference for map authors
- Cross-links to SERIALIZATIONS, QUERY_SPEC, PQG spec, conformance
  matrix
- Data-source + licensing paragraph pointing to the Zenodo community
  (without speculating on specific license terms)

Lands at the site root alongside pubs.qmd and query-spec.qmd.

Note on column naming in snippets: the wide parquet uses `n` for the
source column (PQG convention); lite and sample_facets_v2 use the
friendlier alias `source`. Flagged inline in the snippet comment so
Binder/Colab first-timers don't trip on it.
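
A minimal sketch of the naming difference described above, assuming nothing about the real files beyond the column names `n` and `source`. It uses Python's built-in `sqlite3` in place of DuckDB so it runs without network access; the table contents, the `pid` column, and the source values are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the wide and lite files.
con.execute("CREATE TABLE wide (pid TEXT, n TEXT)")
con.execute("CREATE TABLE lite (pid TEXT, source TEXT)")
sample_rows = [("a", "OPENCONTEXT"), ("b", "SESAR"), ("c", "SESAR")]
con.executemany("INSERT INTO wide VALUES (?, ?)", sample_rows)
con.executemany("INSERT INTO lite VALUES (?, ?)", sample_rows)


def source_breakdown(table: str, column: str) -> list:
    """Count rows per source, exposing the column under the alias `source`."""
    # table and column are fixed literals in this sketch, never user input.
    query = (
        f"SELECT {column} AS source, count(*) AS samples "
        f"FROM {table} GROUP BY {column} ORDER BY samples DESC, source"
    )
    return con.execute(query).fetchall()


wide_counts = source_breakdown("wide", "n")       # wide: column is `n`
lite_counts = source_breakdown("lite", "source")  # lite: column is `source`

# Using the lite-style name on the wide table fails outright.
try:
    con.execute("SELECT source FROM wide")
    wide_has_source = True
except sqlite3.OperationalError:
    wide_has_source = False

print(wide_counts, lite_counts, wide_has_source)
```

Aliasing `n AS source` in queries against the wide file keeps result columns consistent across files, which is one way to avoid the mismatch the inline comment warns about.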

Verified on 2026-04-24: all 6 snippets (incl. the callout quick-start)
execute against data.isamples.org, returning non-empty results.
rdhyee merged commit b954809 into isamplesorg:main on Apr 24, 2026
1 check passed